To investigate ride behavior differences between casual and member users and uncover temporal and spatial patterns in ride activity, a comprehensive and well-structured database is essential. The analysis focuses on understanding how ride patterns vary across time—daily, weekly, and seasonally—and space—stations and routes—while identifying trends in ride duration, station popularity, and overall demand. These insights are critical for guiding Divvy’s operational decisions and marketing strategies.
The source data for this project consists of 12 monthly Divvy trip datasets for the year 2024, containing ride-level information such as ride identifiers, timestamps, start and end stations, and user type (casual vs. member). To efficiently support analysis, a relational database will be designed to:
By implementing this database, analysts will be able to efficiently query and aggregate data, uncover patterns in ride behavior, and generate actionable insights for Divvy’s operational planning and marketing initiatives.
This section establishes a secure connection to the PostgreSQL
database that will be used throughout the analysis. The connection
details, such as host, database name, username, password, and port, are
stored in a configuration file (db_config.ini) for better
security and separation from the code. The connection attempt is wrapped
in a tryCatch block so that any errors produce a clear and
informative message rather than stopping execution silently. Once the
connection is successfully created, it is registered with
knitr, which allows all subsequent SQL chunks in the R
Markdown document to automatically use this connection without
explicitly passing it each time.
# Read config
config <- read.ini("resources/db_config.ini")
db <- config$postgresql
# Safe database connection
tryCatch({
con <- dbConnect(
Postgres(),
host = db$host,
dbname = db$database,
user = db$user,
password = db$password,
port = as.integer(db$port)
)
}, error = function(e) {
stop("Database connection failed: ", e$message)
})
# Register connection for SQL chunks
knitr::opts_chunk$set(connection = con)Before loading any data, the code ensures that a dedicated schema
named divvy exists in the PostgreSQL database. A schema
provides a logical container for all tables, indexes, and views related
to this project, keeping Divvy trip data organized and separate from
other unrelated datasets. By using the IF NOT EXISTS
clause, the code avoids errors if the schema is already present,
ensuring idempotency and safe re-runs of the script. Create database
schema to the database
This part of the workflow brings the raw Divvy trip data for the year
2024 into R. Each month’s data is stored in a CSV file, and the code
dynamically generates file paths for all 12 months. Using a loop, the
files are read into R one by one and saved into a list object named
divvy_data. Each dataset is keyed by its corresponding
month (e.g., "january", "february", etc.),
which makes it easy to reference and process each month individually
later in the pipeline. At this point, the raw data is prepared in memory
and ready for structured loading into the database.
# Define year and months
year <- "2024"
months <- sprintf("%02d", 1:12)
# Initialize list to store data
divvy_data <- list()
# Loop through months and read each file
for (m in months) {
file_path <- paste0("resources/data/", year, m, "-divvy-tripdata.csv")
month_name <- tolower(format(as.Date(paste0(year, "-", m, "-01")), "%B"))
divvy_data[[month_name]] <- read_csv(file_path, show_col_types = FALSE)
}Once the monthly data is loaded into R, the next step is to create
corresponding tables in the PostgreSQL database. For each month, any
existing version of the table is dropped to avoid conflicts. The raw
data is first written into a temporary staging table, which serves as a
safe space for cleaning and transformation. From this staging table, a
final monthly table is created with duplicate rides removed by keeping
only the most recent entry for each ride_id. A primary key
constraint is then added to enforce ride uniqueness at the database
level. Finally, the staging table is dropped to conserve storage and
maintain a clean schema. This process ensures that each monthly table is
consistent, deduplicated, and properly constrained.
# Months we want to load
months <- names(divvy_data) # "january", "february", ... "december"
for (month_name in months) {
# Drop old table if exists
dbExecute(con, glue("DROP TABLE IF EXISTS divvy.{month_name} CASCADE;"))
# Write raw data into a temp table (staging area)
dbWriteTable(
conn = con,
name = DBI::Id(schema = "divvy", table = paste0(month_name, "_staging")),
value = divvy_data[[month_name]],
overwrite = TRUE,
row.names = FALSE
)
# Create final deduplicated + constrained table in one step
dbExecute(con, glue("
CREATE TABLE divvy.{month_name} AS
SELECT DISTINCT ON (ride_id) *
FROM divvy.{month_name}_staging
ORDER BY ride_id, ended_at DESC;
"))
# Add constrained table in one step
dbExecute(con, glue("
ALTER TABLE divvy.{month_name}
ADD CONSTRAINT {month_name}_pk PRIMARY KEY (ride_id);
"))
# Drop the staging table
dbExecute(con, glue("DROP TABLE divvy.{month_name}_staging;"))
}To ensure consistent and accurate analysis, this step enforces
correct data types on time-related fields. The columns
started_at and ended_at are explicitly cast to
the TIMESTAMP type. This guarantees that all time-based
operations, such as extracting hours or calculating trip durations, will
work reliably and without type conversion errors.
Missing values in the dataset can cause errors during analysis or
create misleading results. To address this, the code replaces
NULL values in station name and station ID columns with
sensible defaults. Unknown station names are labeled as
"Unknown", while missing IDs are set to "NA".
This ensures that every row has valid entries for key fields,
simplifying aggregations, joins, and groupings later on.
for (month_name in months) {
handle_null_sql <- glue("
UPDATE divvy.{month_name}
SET start_station_name = COALESCE(start_station_name, 'Unknown'),
end_station_name = COALESCE(end_station_name, 'Unknown'),
start_station_id = COALESCE(start_station_id, 'NA'),
end_station_id = COALESCE(end_station_id, 'NA');
")
dbExecute(con, handle_null_sql)
}Station names often contain inconsistencies such as extra spaces or variations in capitalization. This step cleans those fields by trimming whitespace and converting names to proper case formatting. The result is a consistent representation of station names across all records, which prevents problems when grouping or filtering data by station and improves overall readability of the dataset.
To enrich the dataset for analysis, new columns are added to capture
additional information about each ride. These features include trip
duration in minutes (ride_length_min), the day of the week
when the ride started (day_of_week), whether the ride
occurred on a weekend (is_weekend), and the hour of the day
when the ride began (start_hour). These engineered features
make it possible to perform richer time-based, behavioral, and
segmentation analyses without modifying the raw data.
for (month_name in months) {
extra_feature_sql <- glue("
ALTER TABLE divvy.{month_name}
ADD COLUMN IF NOT EXISTS ride_length_min NUMERIC,
ADD COLUMN IF NOT EXISTS day_of_week TEXT,
ADD COLUMN IF NOT EXISTS is_weekend BOOLEAN,
ADD COLUMN IF NOT EXISTS start_hour INT;
")
dbExecute(con, extra_feature_sql)
}After adding the new columns, this step populates them with values
derived from existing data. The day_of_week is extracted
from the start time, and is_weekend is set to true for
rides taken on Saturdays and Sundays. The start_hour is
calculated from the ride’s start time, allowing for hourly usage
analysis. Finally, the ride_length_min column is computed
as the difference between the ride’s start and end times, converted into
minutes. Together, these transformations create a dataset that is far
more informative and ready for exploratory analysis.
for (month_name in months) {
update_feature_sql <- glue("
UPDATE divvy.{month_name}
SET day_of_week = TRIM(TO_CHAR(started_at, 'Day')),
is_weekend = EXTRACT(ISODOW FROM started_at) IN (6,7),
start_hour = EXTRACT(HOUR FROM started_at),
ride_length_min = EXTRACT(EPOCH FROM (ended_at - started_at)) / 60.0;
")
dbExecute(con, update_feature_sql)
}The final step consolidates all 12 monthly tables into one master
table called all_trips. If an older version of the table
exists, it is dropped to ensure a fresh build. The new table is then
created by combining all the monthly tables using
UNION ALL, which preserves all rows from each month without
removing duplicates. The result is a comprehensive dataset that covers
the entire year of 2024 in a single table. Having one unified table
simplifies queries, reporting, and advanced analytics, since analysts
can work with one central source rather than juggling multiple monthly
datasets.
# Drop if already exists, then create fresh table
dbExecute(con, glue("DROP TABLE IF EXISTS divvy.all_trips CASCADE;"))## [1] 0
all_trips_query <- paste0(
"CREATE TABLE IF NOT EXISTS divvy.all_trips AS\n",
paste(
sprintf("SELECT * FROM divvy.%s", months),
collapse = "\nUNION ALL\n"
)
)
dbExecute(con, all_trips_query)## [1] 5860568
To improve query performance on the divvy.all_trips
table, several indexes have been created. These indexes are aligned with
the analytical views, ensuring that frequent filtering, grouping, and
aggregation operations can run efficiently at scale.
The index on started_at
(idx_all_trips_started_at) is designed to speed up
time-based queries. Many of the analytical views, such as
vw_daily_trends, vw_hourly_usage, and
vw_monthly_trends, rely on filtering, grouping, or
truncating ride start times. By indexing this column, the database can
quickly locate rows that fall within a given date or time range, which
is particularly important for large datasets covering multiple
years.
The index on member_casual
(idx_all_trips_member_casual) supports queries that segment
rides by user type. Views like vw_user_type,
vw_daily_trends, and vw_monthly_trends
frequently group or filter by this column. Since it has relatively low
cardinality (only two values: “member” and “casual”), this index is
lightweight but still useful when combined with other indexed fields in
composite indexes.
Indexes on start_station_id and
end_station_id (idx_all_trips_start_station
and idx_all_trips_end_station) are intended to optimize
spatial analysis. These columns are frequently used in views like
vw_station_popularity and vw_routes, where
queries involve grouping by station or counting trips associated with
specific locations. With these indexes in place, station-based lookups
and aggregations become faster, especially when analyzing popular
stations or routes.
Two multi-column indexes are introduced to support queries that involve both segmentation and time or both station identifiers.
The idx_trips_user_date index combines
member_casual and started_at, which is
especially beneficial for queries in vw_daily_trends and
vw_monthly_trends. These queries often group data by both
user type and time, so the composite index allows for efficient
retrieval across both dimensions simultaneously.
The idx_trips_routes index combines
start_station_id and end_station_id, aligning
directly with the vw_routes view. Since this view
summarizes trips between pairs of stations, the composite index
accelerates grouping and aggregation by route.
The index on ride duration (idx_all_trips_duration) is
created using an expression that calculates the difference between
ended_at and started_at in minutes. This
supports queries that filter or group based on ride length, such as
identifying unusually long or short rides. Although expression indexes
are more specialized, they can significantly improve performance for
duration-based analysis, which otherwise requires recalculating values
for every query execution.
The following views provide a structured way to analyze Divvy
bike-share data for 2024. They are designed to answer specific
analytical and business questions about usage patterns, user behavior,
and demand across time and space. Each view builds on the consolidated
all_trips table and serves as a reusable component for
reporting and exploratory analysis.
The vw_all_trips view acts as the primary entry point
for analyzing the dataset. It is simply a direct reference to the
all_trips table, ensuring consistency when working with
derived views or queries. This provides analysts with a reliable way to
access the full dataset without repeatedly referencing the base
table.
This view provides a quick comparison of ride behavior between casual riders and members. It summarizes the number of rides taken by each user type and calculates their average ride duration in minutes. The purpose of this view is to highlight differences in usage patterns, which can inform marketing strategies, membership growth initiatives, and operational planning.
The vw_daily_trends view tracks ride activity on a daily
basis, broken down by user type. It captures both the total number of
rides and the average duration per day. This view is particularly useful
for detecting seasonal patterns, comparing weekday versus weekend usage,
and identifying demand fluctuations over time.
This view analyzes rides by hour of the day and distinguishes between casual and member users. It provides counts of rides for each hour, enabling insights into peak demand periods such as commuting hours or recreational usage times. The results from this view are helpful for understanding daily demand cycles and planning resource allocation like bike availability or re-balancing.
The vw_station_popularity view ranks stations based on
the number of departures. It identifies the most frequently used
starting locations by counting departures per station and assigning a
popularity rank. This analysis helps uncover hotspots in the network and
provides guidance for station placement, expansion, or infrastructure
improvements.
This view summarizes trips between pairs of stations, providing the total number of rides for each route. It enables the identification of the most popular travel flows and routes across the network. Such insights are valuable for planning route optimization, understanding commuting patterns, and supporting operational decisions like station balancing or targeted promotions.
The vw_monthly_trends view aggregates ride data on a
monthly basis, broken down by user type. It reports both the number of
rides and the average duration per month, making it ideal for analyzing
seasonal demand, long-term trends, and the effects of membership or
marketing campaigns. This view supports high-level strategic
decision-making by highlighting shifts in user behavior over time.
CREATE OR REPLACE VIEW divvy.vw_weekly_patterns AS
SELECT
member_casual,
EXTRACT(DOW FROM started_at) AS day_of_week, -- 0=Sunday … 6=Saturday
COUNT(*) AS total_rides,
ROUND(AVG(EXTRACT(EPOCH FROM (ended_at - started_at)) / 60), 2) AS avg_duration_min
FROM divvy.all_trips
GROUP BY member_casual, EXTRACT(DOW FROM started_at)
ORDER BY member_casual, day_of_week;CREATE OR REPLACE VIEW divvy.vw_seasonal_trends AS
SELECT
member_casual,
CASE
WHEN EXTRACT(MONTH FROM started_at) IN (12, 1, 2) THEN 'Winter'
WHEN EXTRACT(MONTH FROM started_at) IN (3, 4, 5) THEN 'Spring'
WHEN EXTRACT(MONTH FROM started_at) IN (6, 7, 8) THEN 'Summer'
WHEN EXTRACT(MONTH FROM started_at) IN (9, 10, 11) THEN 'Fall'
END AS season,
COUNT(*) AS total_rides,
ROUND(AVG(EXTRACT(EPOCH FROM (ended_at - started_at)) / 60), 2) AS avg_duration_min
FROM divvy.all_trips
GROUP BY member_casual, season
ORDER BY member_casual, season;CREATE OR REPLACE VIEW divvy.vw_roundtrips AS
SELECT
member_casual,
CASE WHEN start_station_id = end_station_id THEN 'Roundtrip'
ELSE 'One-way'
END AS trip_type,
COUNT(*) AS total_rides,
ROUND(AVG(EXTRACT(EPOCH FROM (ended_at - started_at)) / 60), 2) AS avg_duration_min
FROM divvy.all_trips
GROUP BY member_casual, trip_type
ORDER BY member_casual, trip_type;CREATE OR REPLACE VIEW divvy.vw_member_casual_ratio AS
SELECT
DATE_TRUNC('day', started_at) AS ride_date,
SUM(CASE WHEN member_casual = 'member' THEN 1 ELSE 0 END) AS member_rides,
SUM(CASE WHEN member_casual = 'casual' THEN 1 ELSE 0 END) AS casual_rides,
ROUND(
(SUM(CASE WHEN member_casual = 'member' THEN 1 ELSE 0 END)::NUMERIC /
NULLIF(SUM(CASE WHEN member_casual IN ('member','casual') THEN 1 ELSE 0 END), 0)
) * 100, 2
) AS pct_member,
ROUND(
(SUM(CASE WHEN member_casual = 'casual' THEN 1 ELSE 0 END)::NUMERIC /
NULLIF(SUM(CASE WHEN member_casual IN ('member','casual') THEN 1 ELSE 0 END), 0)
) * 100, 2
) AS pct_casual
FROM divvy.all_trips
GROUP BY ride_date
ORDER BY ride_date;This query calculates the total number of rides recorded in the
dataset. By simply counting all rows in the divvy.all_trips
table, we establish the size of the dataset and gain an overall sense of
its scale. This figure provides important context for interpreting
future analyses, as it allows us to measure the proportion of rides that
fall into different categories and track patterns against the total
ridership base.
| count |
|---|
| 5860568 |
This query examines the dataset for missing or null entries in key
fields such as start_station_name,
end_station_name, and rideable_type. By
summing all the null values in each of these columns, we can identify
the extent to which the dataset is complete or suffers from data gaps.
Missing values may arise from incomplete trip records, technical errors,
or limitations in the data collection system. Understanding how
widespread missing values are is essential, since they can bias results
and must often be handled through cleaning, filtering, or imputation
before further analysis.
SELECT
SUM(CASE WHEN start_station_name IS NULL THEN 1 ELSE 0 END)
AS missing_start_station,
SUM(CASE WHEN end_station_name IS NULL THEN 1 ELSE 0 END)
AS missing_end_station,
SUM(CASE WHEN rideable_type IS NULL THEN 1 ELSE 0 END)
AS missing_ride_type
FROM divvy.all_trips;| missing_start_station | missing_end_station | missing_ride_type |
|---|---|---|
| 0 | 0 | 0 |
This query investigates whether the dataset contains duplicate
records based on the ride_id field, which is supposed to
uniquely identify each trip. By grouping rides by their IDs and checking
for counts greater than one, it highlights any instances where the same
ride appears multiple times. Detecting duplicates is an important part
of data quality assurance, as repeated records can inflate counts and
distort statistical measures if not addressed properly.
| ride_id | count |
|---|---|
| F74208AD3A0B73CB | 2 |
| DF457AB27F30DC3D | 2 |
| 0354FD0756337B59 | 2 |
| 7FD8876225A67920 | 2 |
| CF25E56A43052CFF | 2 |
| 44CC4F2EBA96A21E | 2 |
| 29F756066AB38269 | 2 |
| 3CE5E2EB3B4E1999 | 2 |
| 439F3F47987B3C29 | 2 |
| 07DBFDA3C91006AE | 2 |
This query provides descriptive statistics on trip durations while filtering out extreme outliers, keeping only rides that last between one minute and twenty-four hours. It calculates the average, median, and 90th percentile ride length for both members and casual riders. These summary statistics are useful for comparing behavioral patterns between the two groups, such as whether casual riders tend to take longer trips than members. Filtering the data ensures that anomalies, like extremely short or excessively long rides, do not skew the results.
SELECT member_casual,
AVG(ride_length_min) AS avg_duration,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY ride_length_min)
AS median_duration,
PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY ride_length_min)
AS perc90_duration
FROM divvy.all_trips
WHERE ride_length_min BETWEEN 1 AND 1440 -- filter outliers (1 min – 24 hrs)
GROUP BY member_casual;| member_casual | avg_duration | median_duration | perc90_duration |
|---|---|---|---|
| casual | 21.56863 | 12.43048 | 43.70709 |
| member | 12.40836 | 8.84685 | 24.07167 |
This query explores ridership patterns by hour of the day. By extracting the hour from the ride’s start time and grouping rides accordingly, it shows how ridership varies throughout the day for both members and casual users. The results provide insights into daily usage cycles, highlighting peaks during commuting hours for members or leisure times for casual riders. Understanding these hourly patterns is valuable for scheduling, resource allocation, and operational planning.
SELECT member_casual,
EXTRACT(HOUR FROM started_at) AS hour,
COUNT(*) AS rides
FROM divvy.all_trips
GROUP BY member_casual, hour
ORDER BY hour;| member_casual | hour | rides |
|---|---|---|
| casual | 0 | 35460 |
| member | 0 | 32771 |
| casual | 1 | 23740 |
| member | 1 | 20098 |
| casual | 2 | 14751 |
| member | 2 | 11397 |
| casual | 3 | 8204 |
| member | 3 | 7828 |
| casual | 4 | 6454 |
| member | 4 | 8717 |
This query compares ridership between weekdays and weekends by classifying each ride based on the day of the week it occurred. By grouping trips into either “Weekday” or “Weekend,” and separating them by membership type, the analysis reveals distinct usage patterns. Typically, casual riders are more active on weekends for leisure purposes, while members are more active on weekdays, often reflecting commuting behavior. These insights highlight the different roles the service plays for various user groups.
SELECT member_casual,
CASE WHEN EXTRACT(ISODOW FROM started_at)
IN (6,7) THEN 'Weekend' ELSE 'Weekday' END AS day_type,
COUNT(*) AS rides
FROM divvy.all_trips
GROUP BY member_casual, day_type;| member_casual | day_type | rides |
|---|---|---|
| member | Weekday | 2811966 |
| casual | Weekday | 1336304 |
| casual | Weekend | 815354 |
| member | Weekend | 896944 |
This query tracks the number of rides on a month-by-month basis, with results broken down by membership type. Converting the trip start time into a “YYYY-MM” format enables us to observe seasonality and long-term trends in ridership. The analysis may reveal seasonal peaks during warmer months, declines during colder months, or long-term growth in user adoption. Segmenting by membership type further clarifies whether casual or member riders drive these seasonal fluctuations.
SELECT member_casual,
TO_CHAR(started_at, 'YYYY-MM') AS month,
COUNT(*) AS rides
FROM divvy.all_trips
GROUP BY member_casual, month
ORDER BY month;| member_casual | month | rides |
|---|---|---|
| casual | 2024-01 | 24460 |
| member | 2024-01 | 120413 |
| casual | 2024-02 | 47163 |
| member | 2024-02 | 176001 |
| casual | 2024-03 | 82550 |
| member | 2024-03 | 219137 |
| casual | 2024-04 | 131810 |
| member | 2024-04 | 283215 |
| casual | 2024-05 | 231150 |
| member | 2024-05 | 378554 |
This query identifies the ten most frequently used starting stations by counting the number of trips that begin at each location. Ranking stations by ride volume highlights the areas of highest demand within the network. These stations often serve as transportation hubs, tourist destinations, or dense residential and commercial areas. The results provide a foundation for planning bike availability, expanding infrastructure, or improving service in the most active locations.
SELECT start_station_name, COUNT(*) AS rides
FROM divvy.all_trips
GROUP BY start_station_name
ORDER BY rides DESC
LIMIT 10;| start_station_name | rides |
|---|---|
| Unknown | 1073951 |
| Streeter Dr & Grand Ave | 66020 |
| Dusable Lake Shore Dr & Monroe St | 43969 |
| Kingsbury St & Kinzie St | 39659 |
| Michigan Ave & Oak St | 39634 |
| Dusable Lake Shore Dr & North Blvd | 39621 |
| Clark St & Elm St | 35543 |
| Clinton St & Washington Blvd | 34512 |
| Millennium Park | 33169 |
| Clinton St & Madison St | 33024 |
This query determines the most frequently traveled routes by analyzing pairs of starting and ending stations. By counting and ranking the number of rides for each station-to-station connection, it identifies the most popular travel corridors in the system. These routes can indicate regular commuting paths or heavily visited leisure destinations. The results are useful for understanding rider preferences, network efficiency, and areas where targeted operational support may be needed.
SELECT start_station_name, end_station_name, COUNT(*) AS rides
FROM divvy.all_trips
GROUP BY start_station_name, end_station_name
ORDER BY rides DESC
LIMIT 10;| start_station_name | end_station_name | rides |
|---|---|---|
| Unknown | Unknown | 526345 |
| Streeter Dr & Grand Ave | Streeter Dr & Grand Ave | 9668 |
| Dusable Lake Shore Dr & Monroe St | Dusable Lake Shore Dr & Monroe St | 7917 |
| Calumet Ave & 33rd St | State St & 33rd St | 5921 |
| State St & 33rd St | Calumet Ave & 33rd St | 5913 |
| Dusable Lake Shore Dr & Monroe St | Streeter Dr & Grand Ave | 5692 |
| Ellis Ave & 60th St | Ellis Ave & 55th St | 5437 |
| Michigan Ave & Oak St | Michigan Ave & Oak St | 5262 |
| Ellis Ave & 60th St | University Ave & 57th St | 5187 |
| Ellis Ave & 55th St | Ellis Ave & 60th St | 5137 |
This query examines how different types of bikes are used by members
versus casual riders. By grouping rides by both membership status and
rideable_type, it counts how many rides fall into each
category. This analysis highlights preferences for specific vehicle
types, such as whether casual users favor electric bikes while members
prefer classic bikes. Understanding these patterns can help inform
operational decisions, fleet composition, and marketing strategies
tailored to user groups.
SELECT member_casual, rideable_type, COUNT(*) AS rides
FROM divvy.all_trips
GROUP BY member_casual, rideable_type;| member_casual | rideable_type | rides |
|---|---|---|
| casual | classic_bike | 974966 |
| casual | electric_bike | 1091477 |
| casual | electric_scooter | 85215 |
| member | classic_bike | 1760670 |
| member | electric_bike | 1889118 |
| member | electric_scooter | 59122 |
This query identifies the stations with the largest mismatch between departures and arrivals. It calculates a “net outflow” by subtracting the number of trips ending at a station from the number of trips starting there. Stations with high positive values experience more departures than arrivals, leading to potential bike shortages, while negative values indicate stations where too many bikes accumulate. These results are crucial for operational planning, as they highlight the locations most in need of rebalancing to maintain service reliability and meet rider demand.
SELECT start_station_name AS station,
COUNT(*) FILTER (WHERE start_station_name IS NOT NULL) -
COUNT(*) FILTER (WHERE end_station_name IS NOT NULL) AS net_outflow
FROM divvy.all_trips
GROUP BY station
ORDER BY net_outflow DESC
LIMIT 10;| station | net_outflow |
|---|---|
| 21st St & Pulaski Rd | 0 |
| 63rd St Beach | 0 |
| 900 W Harrison St | 0 |
| Aberdeen St & 103rd St | 0 |
| Aberdeen St & Jackson Blvd | 0 |
| Aberdeen St & Monroe St | 0 |
| Aberdeen St & Randolph St | 0 |
| Ada St & 113th Place | 0 |
| Ada St & 113th St | 0 |
| 2112 W Peterson Ave | 0 |